Abstract. In this paper, we analyze the local clustering coefficient of
preferential attachment models. A general approach to preferential attachment
was introduced in [18], where a wide class of models (PA-class)
was defined in terms of constraints that are sufficient for the study of 
the degree distribution and the clustering coefficient. It was previously 
shown that the degree distribution in all models of the PA-class follows 
a power law. Also, the global clustering coefficient was analyzed and a 
lower bound for the average local clustering coefficient was obtained. We 
expand the results of [18] by analyzing the local clustering coefficient for
the PA-class of models. Namely, we analyze the behavior of C(d) which 
is the average local clustering for the vertices of degree d. 

Keywords: networks, random graph models, preferential attachment, 
clustering coefficient. 


1 Introduction 


Many practical problems today involve the analysis of growing real-world
networks, from the Internet and social networks to biological networks [2].
Models of real-world networks are used in physics, information retrieval,
data mining, bioinformatics, etc. Extensive reviews of real-world networks
and their applications can be found elsewhere.

It turns out that many real-world networks of diverse nature share some
typical properties: small diameter, power-law degree distribution, high
clustering, and others [13,16,17,23]. Probably the most extensively studied
property of networks is their vertex degree distribution. For the majority of
studied real-world networks, the portion of vertices with degree $d$ was
observed to decrease as $d^{-\gamma}$, usually with $2 < \gamma < 3$.


Another important characteristic of a network is its clustering coefficient, 
which has the following two most used versions: the global clustering coefficient 
and the average local clustering coefficient (see Section 2.3 for the definitions). 
It is believed that for many real-world networks both the average local and the
global clustering coefficients tend to a non-zero limit as the network becomes large.


* This is an extended version of the paper that appeared in Proc. WAW’15, LNCS 9479,
pp. 15-28, 2015. 
Indeed, in many observed networks the values of both clustering coefficients are
considerably high.

The most well-known approach to modeling complex networks is the
preferential-attachment idea. Many different models are based on this idea: LCD [7],
Buckley-Osthus, Holme-Kim, RAN [23], and many others. A general
approach to preferential attachment was introduced in [18], where a wide class
of models was defined in terms of constraints that are sufficient for the study of
the degree distribution (PA-class) and the clustering coefficient (T-subclass of
PA-class).

In this paper, we analyze the behavior of $C(d)$, the average local clustering
coefficient for the vertices of degree $d$, in the T-subclass. It was previously
shown that in real-world networks $C(d)$ usually decreases as $d^{-\psi}$ with some
parameter $\psi > 0$ [11,20,22]. For some networks, $C(d)$ scales as a power law
$C(d) \sim d^{-1}$ [15,19]. In the current paper, we prove that in all models of the
T-subclass the local clustering coefficient $C(d)$ asymptotically behaves as
$C \cdot d^{-1}$, where $C$ is some constant. We also illustrate these results
empirically. In addition, we suggest and empirically verify (for $A < 0.75$) an
approximation for the average local clustering coefficient $C_2(n)$.

The remainder of the paper is organized as follows. In Section 2, we give
a formal definition of the PA-class and present some known results. Then, in
Section 3, we state new results on the behavior of the local clustering $C(d)$. We
prove the theorems in Section 4. In Section 5, we run simulations in order
to illustrate our results for $C(d)$ and to empirically analyze the average local
clustering coefficient. Section 6 concludes the paper.


2 Generalized Preferential Attachment 


2.1 Definition of the PA-class 


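In this section, we define the PA-class of models, which was first suggested in [18].
Let $G_m^n$ ($n \ge n_0$) be a graph with $n$ vertices $\{1,\ldots,n\}$ and $mn$ edges obtained
as a result of the following process. We start at the time $n_0$ from an arbitrary
graph $G_m^{n_0}$ with $n_0$ vertices and $mn_0$ edges. On the $(n+1)$-th step ($n \ge n_0$),
we make the graph $G_m^{n+1}$ from $G_m^n$ by adding a new vertex $n+1$ and $m$ edges
connecting this vertex to some $m$ vertices from the set $\{1,\ldots,n,n+1\}$. Denote
by $d_v^n$ the degree of a vertex $v$ in $G_m^n$. If for some constants $A$ and $B$ the following
conditions are satisfied

$$\mathsf{P}\left(d_v^{n+1} = d_v^n \mid G_m^n\right) = 1 - A\frac{d_v^n}{n} - B\frac{1}{n} + O\left(\frac{(d_v^n)^2}{n^2}\right), \quad 1 \le v \le n, \tag{1}$$

$$\mathsf{P}\left(d_v^{n+1} = d_v^n + 1 \mid G_m^n\right) = A\frac{d_v^n}{n} + B\frac{1}{n} + O\left(\frac{(d_v^n)^2}{n^2}\right), \quad 1 \le v \le n, \tag{2}$$

$$\mathsf{P}\left(d_v^{n+1} = d_v^n + j \mid G_m^n\right) = O\left(\frac{(d_v^n)^2}{n^2}\right), \quad 2 \le j \le m, \ 1 \le v \le n, \tag{3}$$

$$\mathsf{P}\left(d_{n+1}^{n+1} = m + j\right) = O\left(\frac{1}{n}\right), \quad 1 \le j \le m, \tag{4}$$

then the random graph process $G_m^n$ is a model from the PA-class. Here, as in [18],
we require $2mA + B = m$ and $0 \le A \le 1$.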

As explained in [18], even fixing the values of the parameters $A$ and $m$ does
not specify a concrete procedure for constructing a network. There are many
models possessing very different properties and satisfying the conditions (1)-(4),
e.g., the LCD, the Buckley-Osthus, the Holme-Kim, and the RAN models.


2.2 Power Law Degree Distribution 

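Let $N_n(d)$ be the number of vertices of degree $d$ in $G_m^n$. The following theorems
on the expectation of $N_n(d)$ and its concentration were proved in [18].

Theorem 1. For every model in the PA-class and for every $d \ge m$

$$\mathsf{E}N_n(d) = c(m,d)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right),$$

where

$$c(m,d) = \frac{\Gamma\left(d+\frac{B}{A}\right)\Gamma\left(m+\frac{B+1}{A}\right)}{A\,\Gamma\left(d+\frac{A+B+1}{A}\right)\Gamma\left(m+\frac{B}{A}\right)} \sim \frac{\Gamma\left(m+\frac{B+1}{A}\right)d^{-1-\frac{1}{A}}}{A\,\Gamma\left(m+\frac{B}{A}\right)}$$

and $\Gamma(x)$ is the gamma function.

Theorem 2. For every model from the PA-class and for every $d = d(n)$ we have

$$\mathsf{P}\left(|N_n(d) - \mathsf{E}N_n(d)| \ge d\sqrt{n}\log n\right) = n^{-\Omega(\log n)}.$$

Therefore, for any $\delta > 0$ there exists a function $\varphi(n) \in o(1)$ such that

$$\lim_{n\to\infty}\mathsf{P}\left(\exists\, d \le n^{\frac{A-\delta}{4A+2}} : |N_n(d) - \mathsf{E}N_n(d)| \ge \varphi(n)\,\mathsf{E}N_n(d)\right) = 0.$$

These two theorems mean that the degree distribution follows (asymptotically)
the power law with the parameter $1 + \frac{1}{A}$.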
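The constant $c(m,d)$ from Theorem 1 is easy to evaluate numerically. The following is a minimal sketch, not code from the paper: the function names `c` and `c_tail` are hypothetical, $B$ is derived from the constraint $2mA + B = m$, and log-gamma is used only to avoid overflow for large $d$.

```python
from math import exp, lgamma

def c(m, d, A):
    """Limiting fraction c(m, d) of vertices of degree d (Theorem 1).

    Assumption: B is fixed by the constraint 2mA + B = m, and A > 0.
    """
    B = m - 2 * m * A
    # Log-gamma keeps the ratio of gamma functions stable for large d.
    log_c = (lgamma(d + B / A) + lgamma(m + (B + 1) / A)
             - lgamma(d + (A + B + 1) / A) - lgamma(m + B / A))
    return exp(log_c) / A

def c_tail(m, d, A):
    """Power-law asymptotic of c(m, d): a constant times d^(-1 - 1/A)."""
    B = m - 2 * m * A
    const = exp(lgamma(m + (B + 1) / A) - lgamma(m + B / A)) / A
    return const * d ** (-1.0 - 1.0 / A)
```

For example, with $m = 2$ and $A = 0.25$ (so $B = 1$), `c(2, d, 0.25)` and `c_tail(2, d, 0.25)` agree to within about one percent already at $d = 10^4$, which illustrates the exponent $1 + \frac{1}{A} = 5$.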


2.3 Clustering Coefficient 


A T-subclass of the PA-class was introduced in [18]. In this case, the following
additional condition is required:

$$\mathsf{P}\left(d_i^{n+1} = d_i^n + 1,\ d_j^{n+1} = d_j^n + 1 \mid G_m^n\right) = e_{ij}\frac{D}{mn} + O\left(\frac{d_i^n d_j^n}{n^2}\right). \tag{5}$$

Here $e_{ij}$ is the number of edges between vertices $i$ and $j$ in $G_m^n$ and $D$ is a positive
constant. Note that this property still does not define the correlation between
edges completely, but it is sufficient for studying both global and average local
clustering coefficients.
Let us now define the clustering coefficients. The global clustering coefficient
$C_1(G)$ is the ratio of three times the number of triangles to the number of pairs of
adjacent edges in $G$. The average local clustering coefficient is defined as follows:
$C_2(G) = \frac{1}{n}\sum_{i=1}^{n} C(i)$, where $C(i)$ is the local clustering coefficient for a vertex
$i$: $C(i) = \frac{T^i}{P_2^i}$, where $T^i$ is the number of edges between neighbors of the vertex $i$
and $P_2^i$ is the number of pairs of neighbors. Note that both clustering coefficients
are defined for graphs without multiple edges.
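The two definitions above can be made concrete on a small graph. The sketch below is illustrative only: the function name `clustering_coefficients` and the edge-list input format are assumptions, loops are ignored, multiple edges are collapsed, and $C(i)$ is taken to be $0$ for vertices with fewer than two neighbors (a convention the text does not fix).

```python
from itertools import combinations

def clustering_coefficients(n, edges):
    """Return (C1, C2) for an undirected graph on vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:                  # ignore loops; sets collapse multiple edges
            adj[u].add(v)
            adj[v].add(u)
    closed = 0       # sum of T^i over all i = 3 * (number of triangles)
    pairs = 0        # sum of P_2^i over all i = number of pairs of adjacent edges
    local_sum = 0.0
    for i in range(n):
        t_i = sum(1 for a, b in combinations(adj[i], 2) if b in adj[a])
        p_i = len(adj[i]) * (len(adj[i]) - 1) // 2
        closed += t_i
        pairs += p_i
        if p_i > 0:                 # assumed convention: C(i) = 0 if degree < 2
            local_sum += t_i / p_i
    c1 = closed / pairs if pairs else 0.0
    return c1, local_sum / n
```

On a triangle with one pendant vertex, `clustering_coefficients(4, [(0, 1), (1, 2), (0, 2), (2, 3)])` gives $C_1 = 3/5$ and $C_2 = 7/12$, which shows that the two coefficients generally differ.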

The following theorem on the global clustering coefficient in the T-subclass
was proven in [18].

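Theorem 3. Let $G_m^n$ belong to the T-subclass with $D > 0$. Then, for any $\varepsilon > 0$

(1) If $2A < 1$, then whp $\frac{6(1-2A)D - \varepsilon}{m(4(A+B)+m-1)} \le C_1(G_m^n) \le \frac{6(1-2A)D + \varepsilon}{m(4(A+B)+m-1)}$;

(2) If $2A = 1$, then whp $\frac{6D - \varepsilon}{m(4(A+B)+m-1)\log n} \le C_1(G_m^n) \le \frac{6D + \varepsilon}{m(4(A+B)+m-1)\log n}$;

(3) If $2A > 1$, then whp $n^{1-2A-\varepsilon} \le C_1(G_m^n) \le n^{1-2A+\varepsilon}$.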

Theorem 3 shows that in some cases ($2A \ge 1$) the global clustering coefficient
$C_1(G_m^n)$ tends to zero as the number of vertices grows.

The average local clustering coefficient $C_2(G_m^n)$ was not fully analyzed
previously, but it was shown in [18] that $C_2(G_m^n)$ does not tend to zero for the
T-subclass with $D > 0$. In the next section, we fully analyze the behavior of the
average local clustering coefficient for the vertices of degree $d$.


3 The Average Local Clustering for the Vertices of Degree d


In this section, we analyze the asymptotic behavior of $C(d)$, the average local
clustering for the vertices of degree $d$. Let $T_n(d)$ be the number of triangles on
the vertices of degree $d$ in $G_m^n$ (i.e., the number of edges between the neighbors
of the vertices of degree $d$). Then, $C(d)$ is defined in the following way:

$$C(d) = \frac{T_n(d)}{N_n(d)\binom{d}{2}}. \tag{6}$$

In other words, $C(d)$ is the local clustering coefficient averaged over all vertices
of degree $d$. In order to estimate $C(d)$ we should first estimate $T_n(d)$. After that,
we can use Theorems 1 and 2 on the behavior of $N_n(d)$.

We prove the following result on the expectation of $T_n(d)$.
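Definition (6) translates directly into a per-degree computation. The following sketch is an illustration under stated assumptions rather than the authors' code: the name `clustering_by_degree` and the edge-list input are hypothetical, the graph is treated as simple, and degrees below 2 are skipped because $\binom{d}{2} = 0$ there.

```python
from collections import defaultdict
from itertools import combinations

def clustering_by_degree(n, edges):
    """Return {d: C(d)} with C(d) = T_n(d) / (N_n(d) * binom(d, 2))."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:                  # simple graph: no loops, no multi-edges
            adj[u].add(v)
            adj[v].add(u)
    T = defaultdict(int)            # T[d]: edges between neighbors of degree-d vertices
    N = defaultdict(int)            # N[d]: number of vertices of degree d
    for i in range(n):
        d = len(adj[i])
        N[d] += 1
        T[d] += sum(1 for a, b in combinations(adj[i], 2) if b in adj[a])
    # Degrees 0 and 1 are skipped because binom(d, 2) = 0 for them.
    return {d: T[d] / (N[d] * d * (d - 1) / 2) for d in N if d >= 2}
```

For the triangle with a pendant vertex, `clustering_by_degree(4, [(0, 1), (1, 2), (0, 2), (2, 3)])` yields $C(2) = 1$ and $C(3) = 1/3$.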


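Theorem 4. Let $G_m^n$ belong to the T-subclass of the PA-class with $D > 0$. Then

(1) if $2A < 1$, then $\mathsf{E}T_n(d) = K(d)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right)$;

(2) if $2A = 1$, then $\mathsf{E}T_n(d) = K(d)\left(n + O\left(d^{2+\frac{1}{A}}\cdot\log(n)\right)\right)$;

(3) if $2A > 1$, then $\mathsf{E}T_n(d) = K(d)\left(n + O\left(d^{2+\frac{1}{A}}\cdot n^{2A-1}\right)\right)$;

where

$$K(d) = c(m,d)\left(D + \frac{D}{m}\sum_{i=m}^{d-1}\frac{i}{Ai+B}\right) \sim \frac{D}{Am}\cdot\frac{\Gamma\left(m+\frac{B+1}{A}\right)}{A\,\Gamma\left(m+\frac{B}{A}\right)}\cdot d^{-\frac{1}{A}}.$$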
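As a sanity check on the formula $K(d) = c(m,d)\left(D + \frac{D}{m}\sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)$, one can evaluate it numerically and verify the one-step recurrence $K(d) = \frac{A(d-1)+B}{Ad+B+1}K(d-1) + \frac{D(d-1)}{m(Ad+B+1)}c(m,d-1)$ that is used later in the proof. This is a minimal sketch; the function names `c` and `K` are hypothetical and the parameter values in the usage note are arbitrary.

```python
from math import exp, lgamma

def c(m, d, A, B):
    """c(m, d) from Theorem 1, evaluated through log-gamma."""
    return exp(lgamma(d + B / A) + lgamma(m + (B + 1) / A)
               - lgamma(d + (A + B + 1) / A) - lgamma(m + B / A)) / A

def K(m, d, A, B, D):
    """K(d) = c(m, d) * (D + (D / m) * sum_{i=m}^{d-1} i / (A i + B))."""
    s = sum(i / (A * i + B) for i in range(m, d))
    return c(m, d, A, B) * (D + D / m * s)

def recurrence_rhs(m, d, A, B, D):
    """Right-hand side of the one-step recurrence for K(d), d > m."""
    return ((A * (d - 1) + B) / (A * d + B + 1) * K(m, d - 1, A, B, D)
            + D * (d - 1) / (m * (A * d + B + 1)) * c(m, d - 1, A, B))
```

With, say, $m = 2$, $A = 0.25$, $B = 1$, $D = 0.3$, the values `K(2, d, 0.25, 1.0, 0.3)` and `recurrence_rhs(2, d, 0.25, 1.0, 0.3)` coincide up to floating-point error for every $d > m$, and $K(m) = D\,c(m,m)$.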


Second, we show that the number of triangles on the vertices of degree d is 
highly concentrated around its expectation. 


Theorem 5. Let $G_m^n$ belong to the T-subclass of the PA-class with $D > 0$. Then
for every $d = d(n)$

(1) if $2A < 1$: $\mathsf{P}\left(|T_n(d) - \mathsf{E}T_n(d)| \ge d^2\sqrt{n}\log n\right) = n^{-\Omega(\log n)}$;

(2) if $2A = 1$: $\mathsf{P}\left(|T_n(d) - \mathsf{E}T_n(d)| \ge d^2\sqrt{n}\log^2 n\right) = n^{-\Omega(\log n)}$;

(3) if $2A > 1$: $\mathsf{P}\left(|T_n(d) - \mathsf{E}T_n(d)| \ge d^2 n^{2A-\frac{1}{2}}\log n\right) = n^{-\Omega(\log n)}$.

Consequently, for any $\delta > 0$ there exists a function $\varphi(n) = o(1)$ such that

(1) if $2A \le 1$: $\lim_{n\to\infty}\mathsf{P}\left(\exists\, d \le n^{\frac{A-\delta}{4A+2}} : |T_n(d) - \mathsf{E}T_n(d)| \ge \varphi(n)\,\mathsf{E}T_n(d)\right) = 0$;

(2) if $2A > 1$: $\lim_{n\to\infty}\mathsf{P}\left(\exists\, d \le n^{\frac{A(3-4A)-\delta}{4A+2}} : |T_n(d) - \mathsf{E}T_n(d)| \ge \varphi(n)\,\mathsf{E}T_n(d)\right) = 0$.

As a consequence of Theorems 1, 2, 4, and 5 we get the following result on
the average local clustering coefficient $C(d)$ for the vertices of degree $d$ in $G_m^n$.


Theorem 6. Let $G_m^n$ belong to the T-subclass of the PA-class. Then for any
$\delta > 0$ there exists a function $\varphi(n) = o(1)$ such that

(1) if $2A \le 1$: $\lim_{n\to\infty}\mathsf{P}$

(2) if $2A > 1$: $\lim_{n\to\infty}\mathsf{P}$
It is important to note that Theorems 5 and 6 are informative only for $A < \frac{3}{4}$,
since only in this case the value $n^{\frac{A(3-4A)-\delta}{4A+2}}$ grows. This restriction seems
technical, i.e., one may think that a more accurate estimation of the error terms may
fill the gap between $\frac{3}{4}$ and $1$. However, as we discuss in Section 5, it seems that
for $A > \frac{3}{4}$ the error terms can make a significant contribution to $C(d)$ and the
obtained asymptotic may not work. This means that it is probably impossible
to estimate $C(d)$ in the whole T-subclass for $A > \frac{3}{4}$ and additional constraints
are needed.

In the next section, we first prove Theorem 4. Then, using the Azuma-Hoeffding
inequality, we prove Theorem 5. Theorem 6 is a corollary of Theorems 1, 2, 4, and 5.


4 Proofs 

In all the proofs we use the notation $\theta(\cdot)$ for error terms. By $\theta(X)$ we denote an
arbitrary function such that $|\theta(X)| < X$.
4.1 Proof of Theorem 4

We need the following auxiliary theorem.

Theorem 7. Let $W_n$ be the sum of the squares of the degrees of all vertices in
a model from the PA-class. Then

(1) if $2A < 1$, then $\mathsf{E}W_n = O(n)$,

(2) if $2A = 1$, then $\mathsf{E}W_n = O(n\cdot\log(n))$,

(3) if $2A > 1$, then $\mathsf{E}W_n = O(n^{2A})$.

This statement is mentioned in [18] and it can be proved by induction. Also,
let $S(n,d)$ be the sum of the degrees of all the neighbors of all vertices of degree
$d$. Note that $S(n,d)$ is not greater than the sum of the degrees of the neighbors
of all vertices. The latter is equal to $W_n$, because each vertex of degree $d$ adds $d^2$
to the sum of the degrees of the neighbors of all vertices. So, for any $d$ we have

$$\mathsf{E}S(n,d) \le \mathsf{E}W_n. \tag{7}$$

Now we can prove Theorem 4. Note that we do not take into account the
multiplicities of edges when we calculate the number of triangles, since the clustering
coefficient is defined for graphs without multiple edges. This does not affect the
final result since the number of multiple edges is small for graphs constructed
according to the model [5].

We prove the statement of Theorem 4 by induction on $d$. Also, for each $d$ we
use induction on $n$. First, consider the case $d = m$. The expected number of
triangles on any vertex $t$ of degree $m$ is equal to
$\mathsf{E}\sum_{(i,j)\in E(G_m^t)}\left(e_{ij}\frac{D}{mt} + O\left(\frac{d_i^t d_j^t}{t^2}\right)\right)$
(see (5)). As $G_m^t$ has exactly $mt$ edges, we get
$\mathsf{E}\sum_{(i,j)\in E(G_m^t)}\left(e_{ij}\frac{D}{mt} + O\left(\frac{d_i^t d_j^t}{t^2}\right)\right) = D + o(1)$.
The fact that $\mathsf{E}\sum_{(i,j)\in E(G_m^t)} O\left(\frac{d_i^t d_j^t}{t^2}\right) = o(1)$ can be
shown by induction using the conditions (1)-(4). We also know (see Theorem 1)
that $\mathsf{E}N_n(m) = c(m,m)\,n + O(1)$. So, $\mathsf{E}T_n(m) = (D + o(1))\,(c(m,m)\,n + O(1))$
$= K(m)\,(n + O(1))$. This concludes the proof for the case $d = m$ for all values
of $A$ ($2A < 1$, $2A = 1$, and $2A > 1$).

Consider the case $d > m$. Note that the number of triangles on a vertex of
degree $d$ is $O(d)$: this number is $O(1)$ when the vertex appears, and at
each later step we get a triangle only if we hit both the vertex under consideration
and a neighbor of this vertex; since the degree of our vertex equals $d$, we get
at most $dm$ triangles. Also, $\mathsf{E}N_n(d) = c(m,d)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right)$. So we have
$\mathsf{E}T_n(d) = O(d)\,c(m,d)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right)$. In particular, for $n \le Q\cdot d^2$ (where
the constant $Q$ depends only on $A$ and $m$ and will be defined later) we have
$\mathsf{E}T_n(d) = O\left(c(m,d)\,d^{3+\frac{1}{A}}\right) = O\left(d^2\right) = K(d)\cdot O\left(d^{2+\frac{1}{A}}\right)$. This concludes the
proof for the case $d > m$, $n \le Qd^2$ for all values of $A$.

Now, consider the case $d > m$, $n > Qd^2$. Once we add a vertex $n+1$ and $m$
edges, we have the following possibilities.
1. At least one edge hits a vertex of degree $d$. Then $T_n(d)$ is decreased by the
number of triangles on this vertex (because the degree of this vertex is now at least
$d+1$). The probability to hit a vertex of degree $d$ is $\frac{Ad+B}{n} + O\left(\frac{d^2}{n^2}\right)$. Summing
over all vertices of degree $d$ we obtain that $\mathsf{E}T_n(d)$ is decreased by:

$$\left(\frac{Ad+B}{n} + O\left(\frac{d^2}{n^2}\right)\right)\mathsf{E}T_n(d). \tag{8}$$

2. Exactly one edge hits a vertex of degree $d-1$. Then $T_n(d)$ is increased by
the number of triangles on this vertex. The probability to hit a vertex of degree
$d-1$ once is equal to $\frac{A(d-1)+B}{n} + O\left(\frac{d^2}{n^2}\right)$. Summing over all vertices of degree
$d-1$ we obtain that the value $\mathsf{E}T_n(d)$ is increased by:

$$\left(\frac{A(d-1)+B}{n} + O\left(\frac{d^2}{n^2}\right)\right)\mathsf{E}T_n(d-1). \tag{9}$$

3. Exactly one edge hits a vertex of degree $d-1$ and another edge hits its
neighbor. Then, in addition to (9), $T_n(d)$ is increased by 1. The probability to
hit a vertex of degree $d-1$ and its neighbor is equal to $\frac{D}{mn} + O\left(\frac{(d-1)\,d_i}{n^2}\right)$, where
$d_i$ is the degree of this neighbor. Summing over the neighbors of a given vertex
of degree $d-1$ and then summing over all vertices of degree $d-1$ we obtain that
$\mathsf{E}T_n(d)$ is increased by:

$$(d-1)\,\mathsf{E}N_n(d-1)\frac{D}{mn} + O\left(\mathsf{E}\sum_{\substack{i:\ i \text{ is a neighbor}\\ \text{of a vertex of degree } d-1}}\frac{(d-1)\,d_i}{n^2}\right) = (d-1)\,\mathsf{E}N_n(d-1)\frac{D}{mn} + O\left(\frac{d\cdot\mathsf{E}S(n,d-1)}{n^2}\right). \tag{10}$$

4. Exactly $i$ edges hit a vertex of degree $d-i$, where $i$ is between 2 and $m$.
If no edges hit the neighbors of this vertex, then $T_n(d)$ is increased only by the
number of triangles on this vertex. The probability to hit a vertex of degree $d-i$
exactly $i$ times is equal to $O\left(\frac{d^2}{n^2}\right)$. If we also hit its neighbors, then $T_n(d)$ is
additionally increased by 1 for each neighbor. The probability to hit a vertex
of degree $d-i$ exactly $i$ times and hit some neighbor of it is, obviously, $O\left(\frac{d^2}{n^2}\right)$.
Summing over all vertices of degree $d-i$ and then summing over all $i$ from 2 to
$m$, we obtain that $\mathsf{E}T_n(d)$ is increased by:

$$\sum_{i=2}^{m} O\left(\frac{d^2}{n^2}\right)\cdot\left(\mathsf{E}T_n(d-i) + (d-i)\cdot\mathsf{E}N_n(d-i)\right) = O\left(\frac{d^2}{n^2}\right)\mathsf{E}T_n(d) + O\left(\frac{d^3}{n^2}\right)\mathsf{E}N_n(d). \tag{11}$$

Finally, using (8)-(11) and the linearity of the expectation, we get
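$$\begin{aligned}
\mathsf{E}T_{n+1}(d) &= \mathsf{E}T_n(d) - \left(\frac{Ad+B}{n} + O\left(\frac{d^2}{n^2}\right)\right)\mathsf{E}T_n(d) \\
&\quad + \left(\frac{A(d-1)+B}{n} + O\left(\frac{d^2}{n^2}\right)\right)\mathsf{E}T_n(d-1) + (d-1)\,\mathsf{E}N_n(d-1)\frac{D}{mn} \\
&\quad + O\left(\frac{d\cdot\mathsf{E}S(n,d-1)}{n^2}\right) + O\left(\frac{d^2}{n^2}\right)\mathsf{E}T_n(d) + O\left(\frac{d^3}{n^2}\right)\mathsf{E}N_n(d) \\
&= \left(1 - \frac{Ad+B}{n}\right)\mathsf{E}T_n(d) + \frac{A(d-1)+B}{n}\,\mathsf{E}T_n(d-1) \\
&\quad + O\left(\frac{d^2}{n^2}\right)\left(\mathsf{E}T_n(d) + \mathsf{E}T_n(d-1)\right) + O\left(\frac{d^3}{n^2}\right)\mathsf{E}N_n(d) \\
&\quad + \frac{D}{mn}(d-1)\,\mathsf{E}N_n(d-1) + O\left(\frac{d\cdot\mathsf{E}S(n,d-1)}{n^2}\right).
\end{aligned} \tag{12}$$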
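To see how the leading terms of this recurrence produce the linear growth $\mathsf{E}T_n(d) \approx K(d)\,n$, one can iterate them numerically with all $O(\cdot)$ terms dropped. The sketch below is only an illustration under explicit assumptions: the error terms are ignored, the analogous leading-order recurrence for $\mathsf{E}N_n(d)$ (one new vertex of degree $m$ per step) is assumed, the starting state is hypothetical, and the names `c`, `K` and `iterate_leading_terms` are not from the paper.

```python
from math import exp, lgamma

def c(m, d, A, B):
    return exp(lgamma(d + B / A) + lgamma(m + (B + 1) / A)
               - lgamma(d + (A + B + 1) / A) - lgamma(m + B / A)) / A

def K(m, d, A, B, D):
    return c(m, d, A, B) * (D + D / m * sum(i / (A * i + B) for i in range(m, d)))

def iterate_leading_terms(m, A, D, d_max, n_max):
    """Iterate the leading terms for E N_n(d), E T_n(d); return them divided by n_max."""
    B = m - 2 * m * A
    n0 = int(A * d_max + B) + 2        # keeps every factor 1 - (A d + B)/n in [0, 1]
    N = [0.0] * (d_max + 1)            # N[d] approximates E N_n(d)
    T = [0.0] * (d_max + 1)            # T[d] approximates E T_n(d)
    N[m] = float(n0)                   # hypothetical start: n0 vertices of degree m, no triangles
    for n in range(n0, n_max):
        new_N, new_T = N[:], T[:]
        for d in range(m, d_max + 1):
            leave = (A * d + B) / n    # probability that a degree-d vertex is hit
            new_N[d] = (1 - leave) * N[d]
            new_T[d] = (1 - leave) * T[d]
            if d == m:
                new_N[d] += 1.0        # the new vertex has degree m
                new_T[d] += D          # and carries D triangles in expectation
            else:
                enter = (A * (d - 1) + B) / n
                new_N[d] += enter * N[d - 1]
                new_T[d] += enter * T[d - 1] + D * (d - 1) / (m * n) * N[d - 1]
        N, T = new_N, new_T
    return [x / n_max for x in N], [x / n_max for x in T]
```

For instance, `iterate_leading_terms(2, 0.25, 0.3, 6, 20000)` returns ratios that are close to `c(2, d, 0.25, 1.0)` and `K(2, d, 0.25, 1.0, 0.3)` for $d = 2, \ldots, 6$, consistent with the form $K(d)(n + \text{error})$ claimed in Theorem 4.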


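Consider the case $2A < 1$ (the cases $2A = 1$ and $2A > 1$ will be analyzed
similarly). We prove by induction on $d$ and $n$ that

$$\mathsf{E}T_n(d) = K(d)\left(n + \theta\left(C\cdot d^{2+\frac{1}{A}}\right)\right) \tag{13}$$

for some constant $C > 0$. Let us assume that $\mathsf{E}T_i(\tilde d) = K(\tilde d)\left(i + \theta\left(C\cdot\tilde d^{2+\frac{1}{A}}\right)\right)$
for $\tilde d < d$ and all $i$ and for $\tilde d = d$ and $i < n+1$.

Recall that $K(d) = c(m,d)\left(D + \frac{D}{m}\sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)$ and
$\mathsf{E}N_n(d) = c(m,d)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right)$. If $2A < 1$, then from (7) and Theorem 7 we get
$\mathsf{E}S(n,d-1) = O(n)$ and we obtain: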


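$$\begin{aligned}
\mathsf{E}T_{n+1}(d) &= \left(1 - \frac{Ad+B}{n}\right)K(d)\left(n + \theta\left(Cd^{2+\frac{1}{A}}\right)\right) \\
&\quad + \frac{A(d-1)+B}{n}\,K(d-1)\left(n + \theta\left(C(d-1)^{2+\frac{1}{A}}\right)\right) \\
&\quad + O\left(\frac{d^2}{n^2}\right)\left(K(d)\left(n + \theta\left(Cd^{2+\frac{1}{A}}\right)\right) + K(d-1)\left(n + \theta\left(C(d-1)^{2+\frac{1}{A}}\right)\right)\right) \\
&\quad + O\left(\frac{d^3}{n^2}\right)c(m,d)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right) \\
&\quad + \frac{D}{mn}(d-1)\,c(m,d-1)\left(n + O\left(d^{2+\frac{1}{A}}\right)\right) + O\left(\frac{d}{n}\right).
\end{aligned}$$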


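Note that $K(d) = \frac{A(d-1)+B}{Ad+B+1}K(d-1) + \frac{D(d-1)}{m(Ad+B+1)}c(m,d-1)$. Therefore,
we obtain:

$$\begin{aligned}
\mathsf{E}T_{n+1}(d) &= K(d)(n+1) + K(d)\left(1 - \frac{Ad+B}{n}\right)\theta\left(Cd^{2+\frac{1}{A}}\right) \\
&\quad + K(d-1)\frac{A(d-1)+B}{n}\,\theta\left(C(d-1)^{2+\frac{1}{A}}\right) \\
&\quad + \frac{D(d-1)}{mn}\,c(m,d-1)\,O\left(d^{2+\frac{1}{A}}\right) + O\left(\frac{d}{n}\right) \\
&\quad + O\left(\frac{d^2}{n^2}\right)\left(K(d)\,n + K(d)\,\theta\left(Cd^{2+\frac{1}{A}}\right) + K(d-1)\,n + K(d-1)\,\theta\left(C(d-1)^{2+\frac{1}{A}}\right)\right) \\
&\quad + O\left(\frac{d^3}{n^2}\right)\left(c(m,d)\,n + c(m,d)\,O\left(d^{2+\frac{1}{A}}\right)\right).
\end{aligned}$$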


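In order to show (13), it remains to prove that for some large enough $C$:

$$K(d)\,\frac{Ad+B}{n}\,Cd^{2+\frac{1}{A}} \ge K(d-1)\,\frac{A(d-1)+B}{n}\,C(d-1)^{2+\frac{1}{A}} + O\left(\frac{d^2}{n}\right) + O\left(\frac{C\,d^4}{n^2}\right) + O\left(\frac{d^4}{n^2}\right). \tag{14}$$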


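First, we analyze the following difference:

$$\begin{aligned}
&K(d)\,\frac{Ad+B}{n}\,d^{2+\frac{1}{A}} - K(d-1)\,\frac{A(d-1)+B}{n}\,(d-1)^{2+\frac{1}{A}} \\
&\quad = \frac{Ad+B}{n}\,d^{2+\frac{1}{A}}\left(\frac{A(d-1)+B}{Ad+B+1}\,K(d-1) + \frac{D(d-1)}{m(Ad+B+1)}\,c(m,d-1)\right) - K(d-1)\,\frac{A(d-1)+B}{n}\,(d-1)^{2+\frac{1}{A}} \\
&\quad = \frac{(Ad+B)D(d-1)}{mn(Ad+B+1)}\,c(m,d-1)\,d^{2+\frac{1}{A}} + K(d-1)\,\frac{A(d-1)+B}{n}\left(\frac{Ad+B}{Ad+B+1}\,d^{2+\frac{1}{A}} - (d-1)^{2+\frac{1}{A}}\right).
\end{aligned}$$

Therefore, Equation (14) becomes:

$$C\,\frac{(Ad+B)D(d-1)}{mn(Ad+B+1)}\,c(m,d-1)\,d^{2+\frac{1}{A}} \ge O\left(\frac{d^2}{n}\right) + O\left(\frac{C\,d^4}{n^2}\right).$$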
In the case $2A = 1$ this inequality will be:

$$C\,\frac{(Ad+B)D(d-1)}{mn(Ad+B+1)}\,c(m,d-1)\,d^{2+\frac{1}{A}}\cdot\log(n) \ge O\left(\frac{d^2\log(n)}{n}\right) + O\left(\frac{C\,d^4\log(n)}{n^2}\right).$$

In the case $2A > 1$ this inequality will be:

$$C\,\frac{(Ad+B)D(d-1)}{mn(Ad+B+1)}\,c(m,d-1)\,d^{2+\frac{1}{A}}\cdot n^{2A-1} \ge O\left(\frac{d^2\,n^{2A-1}}{n}\right) + O\left(\frac{C\,d^4\,n^{2A-1}}{n^2}\right).$$

It is easy to see that for $n > Q\cdot d^2$ (for some large $Q$ which depends only on
the parameters of the model) these three inequalities are satisfied. This concludes
the proof of the theorem.


4.2 Proof of Theorem 5

This theorem is proved similarly to the concentration theorem from [18]. We also
need the following notation (introduced in [18]):

$$p_n(d) := \mathsf{P}\left(d_v^{n+1} = d \mid d_v^n = d\right) = 1 - A\frac{d}{n} - B\frac{1}{n} + O\left(\frac{d^2}{n^2}\right),$$

$$p_n^1(d) := \mathsf{P}\left(d_v^{n+1} = d+1 \mid d_v^n = d\right) = A\frac{d}{n} + B\frac{1}{n} + O\left(\frac{d^2}{n^2}\right),$$

$$p_n^j(d) := \mathsf{P}\left(d_v^{n+1} = d+j \mid d_v^n = d\right) = O\left(\frac{d^2}{n^2}\right), \quad 2 \le j \le m,$$

$$p_n := \sum_{k=1}^{m}\mathsf{P}\left(d_{n+1}^{n+1} = m+k\right) = O\left(\frac{1}{n}\right).$$

To prove Theorem 5 we also need the Azuma-Hoeffding inequality:

Theorem 8 (Azuma, Hoeffding). Let $(X_i)_{i=0}^{n}$ be a martingale such that
$|X_i - X_{i-1}| \le c_i$ for any $1 \le i \le n$. Then
$\mathsf{P}\left(|X_n - X_0| \ge x\right) \le 2e^{-\frac{x^2}{2\sum_{i=1}^{n}c_i^2}}$ for any
$x > 0$.

Consider the random variables $X_i(d) = \mathsf{E}\left(T_n(d) \mid G_m^i\right)$, $i = 0, \ldots, n$. Note
that $X_0(d) = \mathsf{E}T_n(d)$ and $X_n(d) = T_n(d)$. It is easy to see that $X_i(d)$ is a
martingale.

We will prove below that for any $i = 0, \ldots, n-1$

(1) if $2A < 1$, then $|X_{i+1}(d) - X_i(d)| \le Md^2$,

(2) if $2A = 1$, then $|X_{i+1}(d) - X_i(d)| \le Md^2\log(n)$,

(3) if $1 < 2A < \frac{3}{2}$, then $|X_{i+1}(d) - X_i(d)| \le Md^2 n^{2A-1}$,

where $M > 0$ is some constant. The theorem follows from this statement
immediately. Indeed, consider the case $2A < 1$. Put $c_i = Md^2$ for all $i$. Then from
the Azuma-Hoeffding inequality it follows that

$$\mathsf{P}\left(|T_n(d) - \mathsf{E}T_n(d)| \ge d^2\sqrt{n}\log n\right) \le 2\exp\left\{-\frac{n\,d^4\log^2 n}{2\,n\,M^2 d^4}\right\} = n^{-\Omega(\log n)}.$$

Therefore, for the case $2A < 1$ the first statement of the theorem is satisfied.
If $d \le n^{\frac{A-\delta}{4A+2}}$, then the value $n\,d^{-\frac{1}{A}}$ is considerably greater than $d^2\log n\sqrt{n}$.
From this the second statement of the theorem follows. The cases $2A = 1$ and
$2A > 1$ can be considered similarly. It remains to estimate $|X_{i+1}(d) - X_i(d)|$.
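The arithmetic behind this application of the Azuma-Hoeffding inequality can be checked directly. The sketch below is illustrative only: the function names and the sample values of $n$, $d$, $M$ are hypothetical, and the code merely evaluates the bound of Theorem 8 for $c_i = Md^2$ and $x = d^2\sqrt{n}\log n$.

```python
from math import exp, log, sqrt

def azuma_hoeffding_bound(x, c):
    """Upper bound 2 * exp(-x^2 / (2 * sum(c_i^2))) on P(|X_n - X_0| >= x)."""
    return 2.0 * exp(-x * x / (2.0 * sum(ci * ci for ci in c)))

def tail_bound_case_2A_lt_1(n, d, M):
    """Bound for x = d^2 * sqrt(n) * log(n) with c_i = M * d^2 for i = 1..n."""
    x = d * d * sqrt(n) * log(n)
    return azuma_hoeffding_bound(x, [M * d * d] * n)
```

The factors $d^4$ and $n$ cancel, so `tail_bound_case_2A_lt_1(n, d, M)` equals $2\exp\left(-\frac{\log^2 n}{2M^2}\right)$ regardless of $d$, which is of the form $n^{-\Omega(\log n)}$ for fixed $M$.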

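Fix $0 \le i \le n-1$ and some graph $G_m^i$. Note that

$$\left|\mathsf{E}\left(T_n(d) \mid G_m^{i+1}\right) - \mathsf{E}\left(T_n(d) \mid G_m^i\right)\right| \le \max_{\tilde G_m^{i+1} \supset G_m^i}\left\{\mathsf{E}\left(T_n(d) \mid \tilde G_m^{i+1}\right)\right\} - \min_{\tilde G_m^{i+1} \supset G_m^i}\left\{\mathsf{E}\left(T_n(d) \mid \tilde G_m^{i+1}\right)\right\}.$$

Put $\hat G_m^{i+1} = \arg\max \mathsf{E}\left(T_n(d) \mid \tilde G_m^{i+1}\right)$, $\bar G_m^{i+1} = \arg\min \mathsf{E}\left(T_n(d) \mid \tilde G_m^{i+1}\right)$. It is
sufficient to estimate the difference $\mathsf{E}\left(T_n(d) \mid \hat G_m^{i+1}\right) - \mathsf{E}\left(T_n(d) \mid \bar G_m^{i+1}\right)$.

For $i+1 \le t \le n$ put

$$\delta_t^i(d) = \mathsf{E}\left(T_t(d) \mid \hat G_m^{i+1}\right) - \mathsf{E}\left(T_t(d) \mid \bar G_m^{i+1}\right).$$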

First, let us note that for $n \le W\cdot d^2$ (the value of the constant $W$ will be defined
later) we have $\delta_n^i(d) \le \frac{2mn}{d}\left(\frac{m(m-1)}{2} + dm\right) \le 4m^2 n \le Md^2 \le Md^2\log(n) \le Md^2 n^{2A-1}$
for some constant $M$ which depends only on $W$ and $m$. Indeed, we have at most
$\frac{2mn}{d}$ vertices of degree $d$, and each vertex of degree $d$ has at most $\frac{m(m-1)}{2}$
triangles when this vertex appears; in addition, at each step we get a triangle only
if we hit both the vertex under consideration and a neighbor of this vertex, and
since the degree of our vertex is equal to $d$, we get at most $dm$ further triangles.

It remains to estimate $\delta_n^i(d)$ for $n > Wd^2$. Consider the case $2A < 1$. We want
to prove that $\delta_n^i(d) \le Md^2$ for $n > Wd^2$ by induction. Suppose that $n = i+1$.
Fix $G_m^i$. Graphs $\hat G_m^{i+1}$ and $\bar G_m^{i+1}$ are obtained from the graph $G_m^i$ by adding the
vertex $i+1$ and $m$ edges. These $m$ edges can affect the number of triangles on at
most $m$ previous vertices. For example, they can be drawn to at most $m$ vertices
of degree $d$ and decrease $T_i(d)$ by at most $m\,\frac{d(d-1)}{2}$. Such reasoning finally leads
to the estimate $\delta_{i+1}^i(d) \le Md^2$ for some $M$.

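Now let us use the induction. Consider $t$: $i+1 \le t \le n-1$, $t \ge Wd^2$ (note
that the smaller values of $t$ were already considered). Using similar reasoning
as in the proof of Theorem 4 we get:

$$\delta_{t+1}^i(m) = \delta_t^i(m)\left(1 - p_t(m)\right) + O\left(\frac{1}{t}\right),$$

$$\begin{aligned}
\delta_{t+1}^i(d) &= \delta_t^i(d)\left(1 - p_t(d)\right) + \delta_t^i(d-1)\,p_t^1(d-1) \\
&\quad + \frac{D}{mt}\,(d-1)\cdot\left(\mathsf{E}\left(N_t(d-1) \mid \hat G_m^{i+1}\right) - \mathsf{E}\left(N_t(d-1) \mid \bar G_m^{i+1}\right)\right) \\
&\quad + O\left(\frac{d\cdot\mathsf{E}S(t,d-1)}{t^2}\right) + O\left(\frac{\mathsf{E}T_t(d)\cdot d^2}{t^2}\right) + O\left(\frac{\mathsf{E}N_t(d)\cdot d^3}{t^2}\right).
\end{aligned}$$

Note that $\mathsf{E}\left(N_t(d) \mid \hat G_m^{i+1}\right) - \mathsf{E}\left(N_t(d) \mid \bar G_m^{i+1}\right) = O(d)$ (see [18]) and
$\mathsf{E}S(t,d-1) = O(t)$. From these recurrence relations it is easy to obtain by induction that
$\delta_n^i(d) \le Md^2$ for some $M$. Indeed,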

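$$\delta_{t+1}^i(m) \le Mm^2\left(1 - p_t(m)\right) + \frac{C_1}{t} \le Mm^2\left(1 - \frac{Am+B}{t} + \frac{C_2}{t^2}\right) + \frac{C_1}{t} \le Mm^2$$

for sufficiently large $M$. By $C_i$, $i = 1, 2, \ldots$, we denote some positive constants.
For $d > m$ we get

$$\begin{aligned}
\delta_{t+1}^i(d) &\le Md^2\left(1 - p_t(d)\right) + M(d-1)^2\,p_t^1(d-1) + C_3\frac{d^2}{t} + C_4\frac{d^4}{t^2} \\
&\le Md^2\left(1 - \frac{Ad+B}{t} + C_5\frac{d^2}{t^2}\right) + M(d-1)^2\left(\frac{A(d-1)+B}{t} + C_6\frac{d^2}{t^2}\right) + C_3\frac{d^2}{t} + C_4\frac{d^4}{t^2} \\
&\le Md^2 + \frac{M}{t}\left(A\left(-3d^2+3d-1\right) + B\left(-2d+1\right) + C_7\frac{d^4}{t} + C_3\frac{d^2}{M} + C_4\frac{d^4}{Mt}\right) \\
&\le Md^2 + \frac{M}{t}\left(\left(-3A + C_7\frac{d^2}{t} + \frac{C_3}{M} + C_4\frac{d^2}{Mt}\right)\cdot d^2 + (3A-2B)\cdot d + (B-A)\right) \le Md^2
\end{aligned}$$

for sufficiently large $W$ and $M$.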

In the case $2A = 1$ we have $\mathsf{E}S(t,d-1) = O(t\log(t))$ and we get the following
inequalities:

$$\delta_{t+1}^i(m) \le Mm^2\log(t)\left(1 - p_t(m)\right) + C_1\frac{\log(t)}{t} \le Mm^2\log(t+1),$$

$$\delta_{t+1}^i(d) \le Md^2\log(t)\left(1 - p_t(d)\right) + M(d-1)^2\log(t)\,p_t^1(d-1) + C_2\frac{d^2}{t} + C_3\frac{d\log(t)}{t} + C_4\frac{d^4\log(t)}{t^2} \le Md^2\log(t+1).$$

In the case $2A > 1$ we have $\mathsf{E}S(t,d-1) = O\left(t^{2A}\right)$ and we get the following
inequalities:

$$\delta_{t+1}^i(m) \le Mm^2 t^{2A-1}\left(1 - p_t(m)\right) + C_1\frac{t^{2A-1}}{t} \le Mm^2(t+1)^{2A-1},$$

$$\delta_{t+1}^i(d) \le Md^2 t^{2A-1}\left(1 - p_t(d)\right) + M(d-1)^2 t^{2A-1}\,p_t^1(d-1) + C_2\frac{d^2}{t} + C_3\frac{d\cdot t^{2A-1}}{t} + C_4\frac{d^4\,t^{2A-1}}{t^2} \le Md^2(t+1)^{2A-1}.$$

This concludes the proof of Theorem 5.

t t t- 
Fig. 1: The behavior of $C(d)$


5 Experiments 

In this section, we choose a three-parameter model from the family of polynomial
graph models defined in [18] and analyze the local clustering coefficients $C(d)$ and
$C_2(n)$. First, we illustrate the results on $C(d)$ that we proved in the previous
section. In addition, we consider the case $A > \frac{3}{4}$, for which we do not have a
theoretical proof. In this case, our approximation of $C(d)$ slightly deviates from
the experiment. Finally, we discuss how the average local clustering coefficient
$C_2(n)$ can be approximated.


5.1 Local Clustering Coefficient C(d) 

First, we generated three polynomial graphs with $n = 10^6$, $m = 2$, $D = 0.3$
and different values of $A$. In other words, we fixed the probability of triangle
formation and varied the parameter of the power-law degree distribution. The detailed
graph generation process is described in [15]. We chose $A$ to be 0.25, 0.5 and 0.7,
which corresponds to the three cases of Theorems 4 and 5. These cases also
correspond to three different types of power-law degree distribution: with finite
variance, with infinite variance, and the border case with $\gamma = 3$. Figure 1
illustrates our main result (Theorem 6). Here the theoretical value of $C(d)$ is equal to

$$\frac{K(d)}{\binom{d}{2}c(m,d)} = \frac{2D}{d(d-1)m}\left(m + \sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)$$

according to Theorem 6. We have
also considered the case $A = 0.8$, for which we do not have a theoretical proof. In
this case, the experimental result is also close to the theoretical approximation.
However, one can observe that our approximation slightly underestimates $C(d)$
even for small values of $d$ (see, e.g., $d = 2$ in Figure 1d). This means that for
$A > \frac{3}{4}$ the error terms can make a significant contribution to the value of $C(d)$
and it is probably impossible to get an accurate approximation for the whole
T-subclass for such $A$. So, our restriction $A < \frac{3}{4}$ is essential. In all four cases,
the difference for large $d$ can be explained by the error term.
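The theoretical curve $\frac{2D}{d(d-1)m}\left(m + \sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)$ is straightforward to tabulate. The following is a minimal sketch (the function name `theoretical_C` is hypothetical, and $B$ is derived from $2mA + B = m$); it is not the code used to produce the figure.

```python
def theoretical_C(d, m, A, D):
    """Theoretical C(d) = 2D / (d (d-1) m) * (m + sum_{i=m}^{d-1} i / (A i + B))."""
    B = m - 2 * m * A                       # assumption: constraint 2mA + B = m
    s = sum(i / (A * i + B) for i in range(m, d))
    return 2.0 * D / (d * (d - 1) * m) * (m + s)
```

For $A = 0.5$, $m = 2$ (so $B = 0$) the sum is $2(d-2)$ and the expression collapses to exactly $\frac{2D}{d}$. For other values of $A$, the product $d \cdot C(d)$ approaches $\frac{2D}{Am}$ slowly, with a relative correction of order $\frac{\log d}{d}$.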


5.2 Average Local Clustering Coefficient 

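In this section we empirically analyze the average local clustering coefficient for
the PA-class of models. Recall that the average local clustering coefficient is
defined as $C_2(n) = \frac{1}{n}\sum_{i=1}^{n}C(i)$, where $C(i)$ is the local clustering coefficient
for a vertex $i$: $C(i) = \frac{T^i}{P_2^i}$, $T^i$ is the number of triangles on the vertex $i$ and $P_2^i$ is
the number of pairs of neighbors. Also, $C_2(n)$ can be represented in the following
form: $C_2(n) = \frac{1}{n}\sum_{d=m}^{\infty}\frac{T_n(d)}{\binom{d}{2}}$.

Using Theorem 4 we can approximate the expectation of $C_2(n)$:

$$\begin{aligned}
\mathsf{E}C_2(n) &= \frac{1}{n}\sum_{d=m}^{\infty}\frac{\mathsf{E}T_n(d)}{\binom{d}{2}} \\
&= \sum_{d=m}^{\infty}\frac{2D}{d(d-1)}\left(1 + \frac{1}{m}\sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)\frac{\Gamma\left(m+\frac{B+1}{A}\right)\Gamma\left(d+\frac{B}{A}\right)}{A\,\Gamma\left(m+\frac{B}{A}\right)\Gamma\left(d+\frac{B+A+1}{A}\right)}\left(1 + O\left(\frac{X}{n}\right)\right) \\
&= \sum_{d=m}^{\infty}f(d)\left(1 + O\left(\frac{X}{n}\right)\right),
\end{aligned} \tag{15}$$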


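where

$$f(d) = \frac{2D}{d(d-1)}\left(1 + \frac{1}{m}\sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)\frac{\Gamma\left(m+\frac{B+1}{A}\right)\Gamma\left(d+\frac{B}{A}\right)}{A\,\Gamma\left(m+\frac{B}{A}\right)\Gamma\left(d+\frac{B+A+1}{A}\right)}$$

and

(1) if $2A < 1$, then $X = d^{2+\frac{1}{A}}$,

(2) if $2A = 1$, then $X = d^{2+\frac{1}{A}}\cdot\log(n)$,

(3) if $2A > 1$, then $X = d^{2+\frac{1}{A}}\cdot n^{2A-1}$.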
Fig. 2: The behavior of $C_2(n)$ as a function of $A$ for $n = 10^6$, $m = 2$, $D = 0.3$


It is hard to compute $\sum_{d=m}^{\infty}f(d)$ analytically. Moreover, it is impossible to
prove that the error term $\sum_{d=m}^{\infty}f(d)\cdot O\left(\frac{X}{n}\right)$ behaves as $o(1)$, since this series
does not converge. Therefore, in this section we empirically analyze how well
$\sum_{d=m}^{\infty}f(d)$ approximates the clustering coefficient $C_2(n)$. Further in this section
we consider the behavior of $C_2(n)$ depending on $A$ and on $D$.
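Although the sum of $f(d)$ is hard to compute analytically, it converges quickly and can be evaluated numerically by truncation. The sketch below is an illustration under stated assumptions: the function name `approx_EC2` and the cutoff `d_max` are hypothetical, $f(d)$ is taken as $\frac{2D}{d(d-1)}\left(1 + \frac{1}{m}\sum_{i=m}^{d-1}\frac{i}{Ai+B}\right)c(m,d)$, $B$ is derived from $2mA + B = m$, and degree 1 is skipped because $\binom{1}{2} = 0$.

```python
from math import exp, lgamma

def approx_EC2(m, A, D, d_max=10**4):
    """Truncated sum of f(d) for d = m..d_max (approximation of E C_2(n))."""
    B = m - 2 * m * A
    log_const = lgamma(m + (B + 1) / A) - lgamma(m + B / A)
    total = 0.0
    s = 0.0                      # running value of sum_{i=m}^{d-1} i / (A i + B)
    for d in range(m, d_max + 1):
        if d >= 2:               # binom(d, 2) = 0 for d = 1, so that term is skipped
            c_md = exp(log_const + lgamma(d + B / A)
                       - lgamma(d + (A + B + 1) / A)) / A
            total += 2.0 * D / (d * (d - 1)) * (1.0 + s / m) * c_md
        s += d / (A * d + B)
    return total
```

In the special case $A = 0.5$, $m = 2$ (so $B = 0$), $f(d)$ reduces to $\frac{24D}{d^2(d+1)(d+2)}$ and the series has the closed form $24D\left(\frac{\pi^2}{12} - \frac{19}{24}\right)$, which gives a convenient check of the truncated sum. The value is also exactly proportional to $D$.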


Average Local Clustering Coefficient $C_2(n)$ depending on $A$. We
generated polynomial graphs with $n = 10^6$, $m = 2$, and $D = 0.3$, assigning
$A \in [0.15, 0.8]$. For each value of $A$ we generated 10 graphs and averaged the
obtained values of $C_2(n)$ (see Figure 2). For $A < 0.75$ the theoretical value
$\sum_{d=m}^{\infty}f(d)$ is extremely close to the experiment and only for $A = 0.8$ do we observe
a small error. This is consistent with Figure 1, where we demonstrated that our
approximation of $C(d)$ does not work for $A > \frac{3}{4}$.


Average Local Clustering Coefficient $C_2(n)$ depending on $D$. We
also generated polynomial graphs with $n = 10^6$, $m = 2$, and $A = 0.5$, assigning
$D \in [0.05, 1]$. Again, we averaged $C_2(n)$ over 10 graphs (see Figure 3). For all
$D$ the theoretical value $\sum_{d=m}^{\infty}f(d)$ is extremely close to the experiment. Also,
it follows from Equation (15) that $C_2(n)$ should depend linearly on $D$, and our
experiment confirmed it.


Thus, our experiments suggest that for polynomial models we can approximate
the average local clustering coefficient $C_2(n)$ by $\sum_{d=m}^{\infty}f(d)$ for $A < \frac{3}{4}$.
Fig. 3: The behavior of $C_2(n)$ as a function of $D$ for $n = 10^6$, $m = 2$, $A = 0.5$


6 Conclusion 

In this paper, we study the local clustering coefficient $C(d)$ for the vertices of
degree $d$ in the T-subclass of the PA-class of models. Despite the fact that the
T-subclass generalizes many different models, we are able to analyze the local
clustering coefficient for all these models. Namely, we proved that $C(d)$
asymptotically decreases as $\frac{2D}{Am}\cdot d^{-1}$. In particular, this result implies that one cannot
change the exponent $-1$ by varying the parameters $A$, $D$, and $m$. This basically
means that preferential attachment models in general are not flexible enough to
model $C(d) \sim d^{-\psi}$ with $\psi \ne 1$. In addition, we suggested and empirically verified
(for $A < 0.75$) an approximation for the average local clustering coefficient $C_2(n)$.

We would also like to mention the connection between the obtained behavior
of $C(d)$ and the notion of weak and strong transitivity introduced in [20]. It was
shown in [20] that percolation properties of a network are defined by the type
(weak or strong) of its connectivity. Interestingly, a model from the T-subclass
can belong to either the weak or the strong transitivity class: if $2D < Am$, then we obtain
the weak transitivity; if $2D > Am$, then we obtain the strong transitivity.